Age and gender effects on presence, user experience and usability in virtual environments–first insights

Virtual Reality (VR) is applied in various areas where a high User Experience is essential. The sense of Presence while being in VR and its relation to User Experience therefore form crucial aspects, which are yet to be understood. This study aims at quantifying age and gender effects on this connection. It involved 57 participants who performed a geocaching game in VR, using a mobile phone, as the experimental task and then answered questionnaires measuring Presence (ITC-SOPI), User Experience (UEQ) and Usability (SUS). A higher Presence was found for the older participants, but there was neither a gender difference nor any interaction effect of age and gender. These findings contradict the limited preexisting work, which has shown higher Presence for males and a decrease of Presence with age. Four aspects distinguishing this study from the literature are discussed as explanations and as a starting point for future investigations into the topic. The results further showed higher User Experience ratings and lower Usability ratings for the older participants.

In the long history of defining Presence, there have been a number of definitions, which vary in their scope [24]. Most definitions define Presence in the context of mediated environments, used to induce a feeling of 'being there' (see section 3.1.1 in [24]). We agree with this reasoning and therefore follow the definition of Skarbez et al., who defined Presence as "The perceived realness of a mediated or virtual experience." [24]. This definition focuses on the subjective perception of the user and also includes the plausibility of the experience, which goes beyond the feeling of only 'being there' as a solely spatial sensation. According to the ISO 9241-210 standard, User Experience is defined as "A person's perceptions and responses that result from the use and/or anticipated use of a product, system or service." [25] and Usability is defined as the "extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use" [25]. For VR training applications, a high User Experience is desirable to enhance the effects of the training [8,9][26][27][28]. A high User Experience and Usability further help to increase the acceptance of medical VR applications [29]. User Experience is furthermore important for product developers, for whom it has become a powerful instrument to assess the potential success of a product even before a physical prototype exists. VR can also create a more standardized assessment environment for field experience, whilst at the same time providing a more realistic experience than most other laboratory conditions. Established methods do exist to measure User Experience; however, these methods have been developed for real prototypes, not for virtual prototypes. Applying such methods in VR evaluation can therefore lead to erroneous conclusions and render the advantages of VR evaluations obsolete.
In a first baseline study, our team showed that VR evaluations on measures of User Experience and Usability may be affected by the Presence users have in VR [30]. It was unclear at that stage whether these effects were spread equally among the population investigated, which involved both sexes and a broad age range. Using a previously validated testing environment, which involved a five-sided CAVE, we therefore aimed at assessing whether age- and/or gender-related differences exist, as these would have far-reaching implications for the development of VR environments that are perceived as real. There are anecdotal reports on Presence, User Experience and Usability having age- and gender-specific effects. Numerous studies to date have researched the effects of age exclusively with regard to navigation and wayfinding skills, where VR is used merely as an experimental environment [31][32][33][34][35]. Furthermore, for research on age-related memory capacities, VR is used as an experimentation environment [36]. Gender influence in VR has been researched under the aspect of proximities to avatars [37] or the perception of embodied avatar hands in relation to their gender [38]. A meta-analysis by Peck et al. [39] discussed potential gender bias on simulator sickness and suggested researching whether such biases could be evident for other factors like Presence. Very few studies, however, investigate age and gender effects on Presence [40][41][42]. Additionally, no work so far has looked at both factors in one study. More data on gender- and age-related effects on Presence are consequently urgently needed. Moreover, as questionnaires on Presence, User Experience and Usability are widely established, more detailed knowledge is necessary for practitioners to interpret their results from VR User Experience studies, and to prevent investigators from drawing false or biased conclusions.
For the scientific community, the outcomes of such studies may be helpful to further refine such surveys, thereby making them usable and reliable in VR, or to find corrective values that improve their validity and compensate for the bias introduced by VR.
Our group has shown that connections exist between Presence, User Experience and Usability [30]. We furthermore showed that confounders exist for this connection on a cognitive level, and assessed the effects of low-level ethanol intake on these factors [43]. We decided to build upon these previous studies and explore the existence of age- and gender-related effects on User Experience measures in a further experimental VR study.
We aimed at investigating the research question of whether any age- or gender-related differences in Presence, User Experience and Usability exist in VR. The main contributions of this manuscript are as follows:
• This study is the first to investigate age and gender effects on Presence together; existing studies have each investigated only one of these factors.
• The influence of age and gender on the User Experience and Usability of an application is evaluated for the first time using a virtual environment.
• Specific directions for future research regarding the effects of age and gender on Presence are provided.

Experimental setup
Institutional approval for this study was obtained from the Institute for Machine Tools and Production Processes of the Chemnitz University of Technology. Ethical approval was obtained from the University of Leipzig (number: 251/17-ek), and all participants gave their informed written consent for participation in this study. Data from an earlier study, obtained under the same ethics protocol number, were also included [30,43]. All experiments were conducted according to the principles of the Declaration of Helsinki.
Age and Gender were used as independent variables, and eleven dependent variables covered Presence, Usability and User Experience. For Gender, participants were split into two groups: female and male [44]. For Age, the groups were defined as follows: 18-32 years of age (Younger group) and 48-62 years of age (Older group), providing a clear separation between the cohorts. Twenty-eight samples were included from a previous study on Presence, User Experience and Usability [30]. A further 29 participants were newly tested. All participants were recruited using social media channels and mailing lists. To ensure comparability with our previous study [30], we applied an identical study protocol. This protocol was separated as follows: (1) pre-assessment, (2) main study and (3) post-assessment (see Fig 1). At the beginning of the pre-assessment, the participants were welcomed by the principal investigator and received a verbal explanation detailing the task they would need to fulfill in the virtual environment. Following this introduction, the participants were asked to read and sign a form declaring their informed consent. For the next step, a demographic questionnaire was provided, asking for information about age, gender and educational background, as well as the participants' self-assessed ability to read digital and paper maps. Furthermore, the participants were asked if they had previous contact with VR and geocaching.
Upon completion of the demographic questionnaire, the main study began. The participants were guided to the virtual environment, where they received further information about their objectives and the method of navigating within the virtual environment. The experimental task consisted of a geocaching game in the virtual city center of Chemnitz, Germany, using a smartphone application on a physical mobile phone. The geocaching game was implemented using the Actionbound application of Simon Zwick and Jonathan Rauprich GbR [45] and consisted of seven locations with a tour length of 1.7 km. Information provided on the screen showed the participants their next target location within the virtual city center. On the same screen, a 2D map of the city center was shown alongside the position of the user, to help the participants orient themselves. The user position was updated on the 2D map via an artificial Global Positioning System (GPS) signal, sent by the virtual environment to the mobile phone. Before the geocaching game started, each participant was asked to get familiar with the navigation method. The geocaching task was started once the participants informed the principal investigator that they felt comfortable with the navigation. A five-sided cave automatic virtual environment (CAVE) was used to immerse the participants into the VR scenario (see Fig 2). The CAVE had an edge length of 3 m and was built based on the principles of Cruz-Neira et al. [46,47]. Twenty full HD rear projectors in combination with passive circular polarization enabled stereoscopic vision, and a cluster of eleven computers equipped with NVidia Quadro 6000 video cards rendered the virtual scene. Six optical infrared cameras by ART GmbH (Weilheim i. OB., Germany) tracked the participant's head for calculating their viewpoint. For the navigation, a method developed by Lorenz et al. [48] was used, utilizing a Microsoft Kinect sensor, which tracked the movements of the participants from behind (see Fig 2).
After finishing the main study, the participants concluded the post-assessment by answering questionnaires on Presence in the virtual environment as well as User Experience and Usability with the geocaching application. In a last step, the principal investigator debriefed the participants and answered questions about the study. Task completion time was not measured, as it was deemed irrelevant for the initial study design on the assessment of possible age- and gender-related effects on Presence, User Experience and Usability.

Dependent variables: Presence, User Experience and Usability
To measure Presence, a 12-item short form of the 'Independent Television Commission-Sense of Presence Inventory' (ITC-SOPI) by Lessiter et al. [49] was used. It consists of four scales: 1) Sense of Physical Space, 2) Engagement, 3) Ecological Validity and 4) Negative Effects, as applied previously for this purpose [30]. Each item was rated using a five-point Likert scale. The ITC-SOPI was chosen as it is intended for cross-media usage, which seemed most appropriate given that the experiment includes the usage of a mobile phone.
User Experience [50] describes the subjective assessment of a product by a user. The User Experience Questionnaire (UEQ) by Laugwitz et al. [51], a validated questionnaire to measure User Experience, was used in this study. The user rates 26 bipolar items on a seven-point semantic differential. From these items, six scales are derived: 1) Attractiveness, indicating a positive or negative attitude towards a product; 2) Perspicuity, 3) Efficiency and 4) Dependability, which are summarized as the pragmatic aspects of a product; and 5) Stimulation and 6) Novelty, which represent the hedonic aspects of the product.
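As an illustration of how a scale value can be derived from a seven-point semantic differential, the following minimal sketch assumes the commonly used UEQ convention of rescaling each answer from 1..7 to -3..+3, flipping items whose positive term is printed on the left, and averaging the items of a scale. The scoring procedure actually applied in this study is not stated in the text, and the function names and the example polarity flags are hypothetical.

```python
from statistics import mean


def rescale_ueq_item(raw, positive_term_on_right=True):
    """Map a raw 1..7 answer to -3..+3 (assumed convention).

    Items whose positive adjective is on the left are flipped so that
    higher values always mean a more positive rating.
    """
    if not 1 <= raw <= 7:
        raise ValueError("UEQ answers must lie between 1 and 7")
    centered = raw - 4
    return centered if positive_term_on_right else -centered


def ueq_scale_score(raw_answers, polarity):
    """Average the rescaled items that belong to one scale.

    raw_answers: raw 1..7 answers of one participant for one scale
    polarity:    one flag per item (hypothetical here; the real flags
                 come from the questionnaire's item list)
    """
    if len(raw_answers) != len(polarity):
        raise ValueError("one polarity flag is needed per item")
    return mean(rescale_ueq_item(r, p) for r, p in zip(raw_answers, polarity))


# Hypothetical four-item scale with one reversed item.
example = ueq_scale_score([7, 1, 4, 6], [True, False, True, True])
```

With this convention, a scale value of 0 corresponds to a neutral rating, and the six scale values per participant can be entered into the analysis as dependent variables.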
Usability assesses the users' impression of the fitness for use of a product and corresponds to the pragmatic aspects of User Experience. The well-established System Usability Scale (SUS) by Brooke [52], in which each item is rated on a one-to-five Likert scale, was used in this study to assess Usability.
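For readers unfamiliar with the SUS, the sketch below shows the standard scoring of the ten-item questionnaire, in which odd-numbered (positively worded) items contribute their answer minus 1, even-numbered (negatively worded) items contribute 5 minus their answer, and the sum is multiplied by 2.5 to give a value between 0 and 100. It is an assumption that this standard scoring was applied here, as the text only states the one-to-five rating format; the function name is hypothetical.

```python
def sus_score(responses):
    """Compute a 0..100 SUS score from ten answers on a 1..5 scale.

    Standard scoring: odd items (1, 3, ...) are positively worded and
    contribute (answer - 1); even items are negatively worded and
    contribute (5 - answer). The sum (0..40) is scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("the SUS has exactly ten items")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("SUS answers must lie between 1 and 5")
        if item_number % 2 == 1:
            total += answer - 1
        else:
            total += 5 - answer
    return total * 2.5


# A participant answering "3" (neutral) on every item scores 50.
neutral = sus_score([3] * 10)
```

Because of the alternating item wording, a uniformly neutral response pattern yields the midpoint of the scale rather than a low or high score.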

Statistical methods
A Kolmogorov-Smirnov test was used to check the residuals for normal distribution, followed by a Levene test to check for equality of error variances. A two-way multivariate analysis of variance (MANOVA) was performed thereafter to find possible age- and gender-related effects. P-values equal to or less than 0.05 were considered statistically significant.
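To make the variance check more concrete, the following sketch computes the Levene test statistic W for one dependent variable across several groups. It assumes the original mean-centered variant of the test and treats each cell of the design as one group; neither choice is stated in the text, the function name is hypothetical, and the p-value (which requires the F distribution with k - 1 and N - k degrees of freedom) is deliberately left out to keep the example within the standard library.

```python
from statistics import mean


def levene_w(groups):
    """Levene statistic W (mean-centered variant) for k groups.

    W = (N - k) / (k - 1) * sum_i n_i * (Zbar_i - Zbar)^2
                          / sum_i sum_j (Z_ij - Zbar_i)^2
    with Z_ij = |Y_ij - mean(Y_i)|. Under equal variances, W follows
    approximately an F distribution with (k - 1, N - k) degrees of freedom.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")

    # Absolute deviations of every value from its own group mean.
    deviations = []
    for g in groups:
        center = mean(g)
        deviations.append([abs(y - center) for y in g])

    group_means = [mean(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total

    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    # Note: within is zero if all deviations inside each group are identical.
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy data, not taken from the study.
w = levene_w([[1, 2, 3], [2, 4, 6]])
```

A large W relative to the corresponding F distribution would indicate unequal error variances between the groups.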

Participant demographics
The Younger group had a mean age of 23.9 years (SD = 2.8) and was significantly younger (p < .001) than the Older group (M = 53.4, SD = 3.8). Participants of the Younger group were significantly more familiar with VR systems (20.7 %) compared to the Older group (0 %, p = 0.01). In the Younger group, 34.5 % of the participants were familiar with geocaching, which is significantly more frequent than in the Older group (7.1 %; p = 0.01). The ability to read a paper map or a map on a mobile phone yielded no significant differences between the age groups (see Table 1).
The Female and the Male groups did not differ significantly in terms of age, their ability to read a map on a mobile phone or their previous contact with VR systems. A significant difference was found for previous contact with geocaching (p = 0.046), with 30.3 % reported for the Female group and 8.3 % for the Male group. A further significant difference (p = 0.005) was found for the ability to read a paper map, with 72.6 % of the Female group reporting excellent or good abilities, in contrast to 100 % of the Male group (see Table 2).

Age has a main effect but there are no interaction effects between age and gender
The results of the MANOVA in Table 3 show a significant main effect of age (p = 0.009) on the factors of the ITC-SOPI, UEQ and SUS questionnaires with a large effect size (partial eta squared of 0.411). In contrast, no gender effect or interaction effect of Age and Gender on Presence, User Experience and Usability was found.
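As a reading aid for the reported effect size, the sketch below shows the general identity that relates partial eta squared to an F statistic and its degrees of freedom, together with Cohen's conventional thresholds (0.01 small, 0.06 medium, 0.14 large). The F value and degrees of freedom of the MANOVA are not given in the text, so the numeric inputs in the example are hypothetical; only the value 0.411 is taken from the results above. The function names are likewise hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F test.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


def effect_size_label(eta_p2):
    """Classify partial eta squared using Cohen's conventional cut-offs."""
    if not 0 <= eta_p2 <= 1:
        raise ValueError("partial eta squared lies between 0 and 1")
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical F test: F = 4.0 with 1 and 20 degrees of freedom.
hypothetical = partial_eta_squared(4.0, 1, 20)

# The value reported for the main effect of age.
reported_label = effect_size_label(0.411)
```

Under these conventions, the reported value of 0.411 lies well above the threshold for a large effect.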

Presence is slightly higher for older, usability for younger, and user experience for older participants
Five of the six User Experience scales were rated higher in the Older group, but the difference was significant for only two of them (Dependability, p = 0.004; Novelty, p = 0.023) (see Fig 3). In contrast, Usability was significantly higher for the Younger group (p = 0.036, see Fig 4), indicating that the younger participants found the application more efficient and reliable. No age-related differences were found for Presence except for Ecological Validity (p = 0.014, see Fig 5), indicating that the virtual environment was more realistic and believable for older compared to younger participants.

Discussion
This study has underlined that Presence is only minimally affected by age and that no gender and age interaction effects appear to exist. In the literature, only the effects of age on navigation and wayfinding skills or on memory capacity have been researched, with VR used merely as an experimental environment [31][32][33][34][35][36]. As for gender bias in VR, only aspects like the proximities to avatars [37] or the perception of embodied avatar hands in relation to their gender [38] have been investigated. Only a meta-analysis by Peck et al. [39] discusses a potential gender bias on simulator sickness. Consequently, these studies cannot be used as a comparison to our results. Very few studies, however, investigate age and gender effects on Presence [40][41][42] and thus allow a comparison with our results. Additionally, no work so far has looked at both factors in one study. Literature investigating the effects of Age on Presence has so far mostly involved younger populations and a less well-balanced male-to-female ratio [41]. Further, in one study presented by Kober [41], only one group used a VR setup, and another group of elderly participants included patients with brain lesions, which forms a significant confounder as brain injury might impair or alter the mechanisms by which Presence forms. All of the four studies in Kober [41] used different Presence questionnaires and sample sizes, so that the results are difficult to compare. Our study used a standardized setup with a larger sample size (n = 57) and a large age range amongst participants to specifically address these issues. Kober concluded that Presence declines with age. However, only the two groups with a smaller sample size (n = 20; n = 21) [41] showed significant effects, and one of these groups (n = 21) used a non-VR setup and included brain-injured participants.
In contrast, another two groups showed an increase of Presence with aging, although not on a significant level. These contradictory results are worth further examination. Although we used a multivariate analysis of variance for our analysis instead of a regression analysis as Kober [41] did, we cannot report lower Presence ratings for the Older group. Rather, our findings on Presence indicated that there were only small age-related differences, since most of the Presence factors showed no significant differences. In fact, the only significantly different Presence factor, Ecological Validity, was higher in the Older group. These findings suggest that an older audience may have a higher tolerance for shortcomings in the believability and realism of a virtual environment than a younger audience. For developers of VR applications this finding might be favorable, as it may lower their development costs. Especially in the field of medical therapy and rehabilitation VR applications, higher Ecological Validity in older persons may pose the risk of a cognitive overload compared to younger people. This difference should also be considered in phobia treatment using VR, as fear-inducing situations might be perceived as more realistic by older patients. The significantly higher Ecological Validity values for the Older group in our study might partially be explained by their lesser VR experience and their assumed lesser experience with virtual worlds in general. Conversely, Ecological Validity indicates here that younger people seem to be more critical regarding the authenticity or realness of the environment. One could therefore argue that more work has to be put into the creation of an improved virtual scenario when the target audience of the VR environment consists of predominantly younger people. However, both Kober [41] and our work show that there is a need for further research on age-related effects on Presence to gain reliable knowledge.
Another consideration is that most participants in the Older group had not been exposed to VR-based scenarios for a significant part of their lifespan, in contrast to the Younger group. This may affect their adaptability with respect to Presence and User Experience; it may also limit, to some extent, their ability to become familiar with a new technology. These considerations also put our study in a static context, reflecting the experience of today's younger and older population. Likewise, in a few decades, the experience of a comparably older population may vary markedly from the contemporary experience of populations in different age groups.

PLOS ONE

In another study with a smaller number of participants (n = 20), Felnhofer et al. [40] investigated gender-related differences in Presence by having participants give a 5-minute speech in front of a virtual audience experienced through a head-mounted display (HMD). Using the iGroup Presence Questionnaire (IPQ) [53], they concluded that VR seems to be 'made for males'. Felnhofer et al. report a lower 'Sense of being there', 'Spatial Presence' and 'Realness' for women. Sagnier et al. [42] performed a study (n = 52) on gender-related differences in which participants used an HMD to perform a manual assembly task in the context of an aircraft; Presence was measured with the Witmer & Singer presence questionnaire [54]. They report that 'self-assessment of performance' and 'ability to act' are lower for women. In contrast to Felnhofer et al. [40] and Sagnier et al. [42], we report no significant gender-related effects on Presence. Due to the different experimental tasks, the different presence questionnaires and VR technologies (HMDs vs. CAVE), these three studies and their contradicting findings are difficult to compare. However, there are four common points in the work of Felnhofer et al. and Sagnier et al. that are different from our presented work: (1) the experimental tasks were performed at one place and did not require the participant to move through the virtual environment; (2) the experimental tasks had serious consequential contexts; (3) the participants were only interacting with a virtual object; and (4) the participants could not see their real bodies. Neither Felnhofer et al. nor Sagnier et al. provided any information on whether the participants were presented with a virtual body. Since neither study reported any kind of body tracking, we assume that, at best, the participants were presented with virtual hands. For our study, these four points were different: (1) moving through the virtual environment was a crucial part of the experimental task; (2) the geocaching game can be considered a leisure time and fun activity; (3) the participants used a real smartphone in the context of the virtual environment; and (4) the participants were able to see their real bodies at all times.
It would be too speculative to make assumptions on whether any of these four points are able to explain the different results found by Felnhofer et al., Sagnier et al., and in our study. However, future studies could investigate whether there are circumstances in which a VR experience could lead to differences in Presence based on gender. Furthermore, in the same study, Sagnier et al. [42] also investigated User Experience using the AttrakDiff2 [55] questionnaire. They found lower scores for women on 'hedonic quality stimulation', but not on 'hedonic quality identification' and 'pragmatic quality', which, in general, supports our result of not finding any differences in User Experience or Usability based on gender. However, the deviating result for 'hedonic quality stimulation' calls for further investigation in future studies.
Finding a conclusive answer to the question of how age and gender influence Presence is crucial for developers of commercial VR applications and professional users (e.g., therapists) alike. The success of VR applications could be influenced negatively if age- or gender-related adaptations are necessary for all users to fully enjoy them. In a worst-case scenario, professional users, including therapists, might even treat their patients improperly due to age- or gender-related differences in the impact of the VR experience. Our results suggest that no gender aspects have to be considered by developers of commercial VR applications and professional users. However, in terms of age, the higher perceived Ecological Validity of a virtual experience should make professional users in particular wary of overexciting users.
The results of this study for User Experience (UEQ) and Usability (SUS) of the Younger and Older groups were contrary. The UEQ showed higher results for the Older group, meaning that they liked the application better, which may have been related to the novelty of VR and geocaching. In contrast, the SUS yielded significantly higher scores in the Younger group. These findings may have resulted from the nature of the two questionnaires. The UEQ asks on an abstract level for bipolar adjectives associated with the application. In contrast, the SUS consists of items that directly ask about the application, such as 'I think I would like to use this system frequently'. Whilst the UEQ reflects the older participants' subtle assessment of the application and their general enjoyment of the new experience, the SUS asks directly for their opinion. The positive UEQ ratings thus stand in contrast to the SUS findings, which speak in favor of not using the application in the future. An alternative, speculative interpretation of the conflicting UEQ and SUS ratings could be that the older participants in general liked and enjoyed the geocaching game but did not see any real meaning in it.
In summary, the novel findings in this study are the following:
• In contrast to existing literature, no decrease in Presence with age was observed for the current population with no previous VR experience. Instead, only a significant difference for the factor 'Ecological Validity' was found, which was higher for the Older group.
• In contrast to existing literature, no gender effect on Presence was seen. Furthermore, no gender effect on User Experience and Usability was observed.
• Age was found to have contradicting effects on Usability (lower for the Older group) and on two User Experience factors (higher for the Older group).
• Future research should be directed at clarifying the effects of age and gender on Presence. Further, the reasons for the contradicting results for the effect of age on User Experience and Usability should be investigated.
A number of limitations need to be addressed for this study. The first limitation regards the mode of locomotion used for the geocaching scenario. A few participants struggled with controlling their movements in VR with the Microsoft Kinect sensor-based navigation method. Furthermore, there were known technical limitations with the sensor itself regarding movement recognition. Neither issue affected the results in previous studies using the same navigation method [30,43], and both could probably be solved in the future using a more stable tracking system. Second, some wearers of glasses interrupted the test for a few seconds when they re-adjusted the fit of their glasses and the VR glasses that had to be worn additionally. Further, the Hawthorne effect might have influenced the results, as the users may have anticipated being exposed to something novel and exciting, so that they rated their experience in favor of this anticipation. Lastly, our study must be seen as a snapshot in time, given that the general population may have gained first experiences with VR, while VR is not yet part of everyday life. To investigate possible changes resulting from VR becoming more widespread, our study should be re-evaluated in 3-year intervals. Therefore, our work can only be seen as a starting point for long-term investigations on how the increasing exposure to VR affects Presence and its connection with User Experience and Usability.
Future studies should also use other Presence, User Experience and Usability questionnaires in conjunction with different study tasks to derive reliable and generalizable statements. It would furthermore be very interesting to see if the age-dependent difference found for the Presence scale of Ecological Validity would change if the participants were exposed to multiple environments over a longer period to compensate for different experience levels, possibly also using other VR devices like head-mounted displays. In particular, future work should focus on the impact of technology affinity and on a detailed differentiation of levels of VR experience with regard to Presence and its connection with User Experience and Usability.

Conclusion
In this study, both age- and gender-related effects on Presence, User Experience and Usability in VR are jointly investigated for the first time. The body of literature investigating gender- and age-related effects is very limited and calls for further investigations. In contrast to existing literature, we could not prove gender-related differences in Presence, nor that Presence decreases with age. However, we present four factors that distinguish this work from the existing literature as a basis for further investigating possible gender-related effects on Presence, User Experience and Usability. Our findings suggest that, for the most part, no major age- or gender-related differences exist in Presence. However, older participants seemed to find the VR environment more realistic than the younger participants. Further, no interaction effects were found, and only minor age- and gender-related influences on the results of the ITC-SOPI, UEQ and SUS questionnaires were found. The results for User Experience were higher for the older participants, whilst for Usability the younger participants showed higher ratings.

Supporting information
S1 Dataset. Minimal dataset.csv contains the minimal dataset used for the statistical evaluation and for deriving Figs 3-5. (CSV)